Abstract

A Boolean function $f$ over $n$ variables is said to be $q$-locally correctable if, given black-box
access to a function $g$ which is "close" to an isomorphism $f_\sigma$ of $f$, we can compute $f_\sigma(x)$ for
any $x \in \mathbb{Z}_2^n$ with good probability using $q$ queries to $g$.

We observe that any $k$-junta, that is, any function which depends only on $k$ of its input
variables, is $O(2^k)$-locally correctable. Moreover, we show that there are examples where this
is essentially best possible, and locally correcting some $k$-juntas requires a number of queries
which is exponential in $k$. These examples, however, are far from being typical, and indeed we
prove that for almost every $k$-junta, $O(k \log k)$ queries suffice.

1 Introduction

The field of property testing of Boolean functions has received a considerable amount of attention in
the last two decades. Many properties of functions have been examined in order to estimate the
query complexity needed for testing them, that is, the number of inputs of the function one has
to read in order to distinguish between a function that satisfies the property and one that is "far"
from satisfying it. In particular, one is usually interested in properties for which the number of
queries is independent of the input size. Some of these properties are linearity [7], being a dictator
function [HE], a junta [HI [TO], or a low-degree polynomial [2].

Another property that one might consider testing is function isomorphism, i.e., testing whether two
functions are identical up to relabeling of the input variables. A common scenario is where one
function is given in advance and the goal of the tester is to determine whether the second input function
is isomorphic to it or far from any isomorphism of it. Several recent results indicate that testing
this property is hard for most functions (requires $\Omega(n)$ queries), and specifically for $k$-juntas there
are lower bounds which depend on $k$ (see e.g. [IHdJE]).

The focus of our work is not testing such properties, but rather locally correcting functions, that
is, determining the value of a function at a given point by reading its values at several other points.
This is closely related to random self-reducibility, as pointed out already in [7]. More precisely, we
care about locally correcting specific functions which are known up to isomorphism.

Question. Given a specific Boolean function $f$, what is the query complexity needed in order to
correct an input function which is close to some isomorphism $f_\sigma$ of $f$?

This question can be seen as a special case of locally correctable codes (see, e.g., [12]). Namely,
each codeword would be the $2^n$ evaluations of an isomorphism of the input function (at most $n!$
distinct codewords) and we would like to correct any specific value of the given noisy codeword
using as few queries as possible.

Here we study the above question mostly for juntas. We provide both exponential lower and upper
bounds for the query complexity of locally correcting juntas with respect to their size.
However, the given lower bound is applicable only to a small portion of the juntas and in fact we
show that most $k$-juntas are locally correctable using a nearly linear (in $k$) number of queries.

1.1 Preliminaries 

In order to correct functions, we need to first define when two functions are "close", as otherwise
correction is hopeless. We use the common definition, saying two Boolean functions are $\varepsilon$-close if
they agree on all but at most an $\varepsilon$ fraction of the inputs. The following definition best describes
the focus of our work, indicating when a function is locally correctable.

Definition. A Boolean function $f : \mathbb{Z}_2^n \to \mathbb{Z}_2$ is said to be $q$-locally correctable for $\varepsilon > 0$ if the
following holds. There exists an algorithm that, given an input function $g$ which is $\varepsilon$-close to an
isomorphism $f_\sigma$ of $f$, can determine the value $f_\sigma(x)$ for any specific $x \in \mathbb{Z}_2^n$ with probability at least
$2/3$, using $q$ queries to $g$.

More generally, we also define a family of functions to be locally correctable when the algorithm
is not required to know which specific function from the family it is trying to correct.

Definition. A family $\mathcal{F}$ of Boolean functions is said to be $q$-locally correctable for $\varepsilon > 0$ if the
following holds. There exists an algorithm that, given an input function $g$ which is $\varepsilon$-close to an
isomorphism $f_\sigma$ of $f$, for some $f \in \mathcal{F}$ from the family, can determine the value $f_\sigma(x)$ for any
specific $x \in \mathbb{Z}_2^n$ with probability at least $2/3$, using $q$ queries to $g$.

A crucial observation when looking at the above definitions is the fact that the mentioned
algorithm must succeed for every input $x \in \mathbb{Z}_2^n$. Were this requirement replaced by the ability to determine
the value at a uniformly random $x$, any function would be trivially 1-locally correctable for $\varepsilon <
1/3$, as at least $2/3$ of the inputs remain unmodified. In addition, it is useful to think of $\varepsilon$ as a
constant independent of $n$, which however can depend on some property of the function $f$. This
dependence is often required to ensure that $g$ is close to a unique isomorphism of $f$ (up to equivalent
isomorphisms).

A simple result regarding juntas is an exponential upper bound for the number of queries, in 
terms of the junta's size. For this upper bound, we use the analysis of testing low-degree polynomials 
and therefore get the following more general bound. 

Proposition 1. Every polynomial of degree $k$ is $O(2^k)$-locally correctable for $\varepsilon < 2^{-k-3}$.

Proof sketch. The techniques used in testing low-degree polynomials rely on their values on the
points of random affine subcubes inside $\mathbb{Z}_2^n$, which are defined by random bases of $k+1$ vectors and
an offset in $\mathbb{Z}_2^n$ (see [2]). Taking such a subcube and summing the values of a degree $k$ polynomial over
all $2^{k+1}$ of its elements always results in zero. The test itself selects several such random subcubes
and verifies that this is indeed the case. Since in our case we are given some specific input $x \in \mathbb{Z}_2^n$
at which we want to correct the function, we can use a similar argument.

Given the input $x$, we randomly select $k+1$ vectors $x_1, x_2, \ldots, x_{k+1}$ and consider the affine
subcube whose basis is the set of these $k+1$ vectors and whose offset is $x$. Since the sum of
evaluations inside this affine subcube (which includes $x$) is zero, we can deduce the value at $x$ by
querying the other $2^{k+1} - 1$ locations of the cube, assuming none of them was modified. Since we
select the $k+1$ vectors in the basis randomly, relying on the (easy case in the) analysis of [2], which
is based on the fact that each input queried is uniformly random, we can bound the probability that
some queried location was modified by $(2^{k+1} - 1)\varepsilon < 1/4$, and therefore this algorithm shows that
$f$ is indeed $O(2^k)$-locally correctable for $\varepsilon < 2^{-k-3}$. □
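
The correction step in the proof sketch can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: polynomials over GF(2) are represented as lists of monomials (tuples of variable indices), and the helper names `random_degree_k_poly`, `evaluate` and `correct_at` are hypothetical names introduced here, not taken from the paper or from [2]. The code recovers the value at $x$ as the XOR of the other $2^{k+1} - 1$ points of a random affine subcube through $x$.

```python
import itertools
import random

def random_degree_k_poly(n, k, rng, terms=8):
    """Random GF(2) polynomial of degree <= k, as a list of monomials (index tuples)."""
    return [tuple(rng.sample(range(n), rng.randint(0, k))) for _ in range(terms)]

def evaluate(poly, x):
    """Evaluate the polynomial at the bit tuple x over GF(2)."""
    return sum(all(x[i] for i in mono) for mono in poly) % 2

def correct_at(g, x, k, rng):
    """Guess the value at x from the other 2^(k+1) - 1 points of a random subcube.

    The sum of a degree-k polynomial over all 2^(k+1) points of the subcube is
    zero, so the value at x equals the XOR of the remaining points, provided
    that none of the queried values of g was modified.
    """
    n = len(x)
    basis = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(k + 1)]
    total = 0
    for coeffs in itertools.product((0, 1), repeat=k + 1):
        if not any(coeffs):
            continue  # the all-zero combination is x itself; do not query it
        point = list(x)
        for c, v in zip(coeffs, basis):
            if c:
                point = [a ^ b for a, b in zip(point, v)]
        total ^= g(tuple(point))
    return total
```

With an unmodified oracle the returned value always equals the true value at $x$; with an $\varepsilon$-close oracle it can fail only when one of the $2^{k+1} - 1$ queried points was modified.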



Corollary 2. The family of $k$-juntas is $O(2^k)$-locally correctable for $\varepsilon < 2^{-k-3}$.

Proof. A $k$-junta is in particular a polynomial of degree $k$ and therefore is also $O(2^k)$-locally
correctable using the above proposition. In addition, the algorithm suggested by the proposition does
not require any knowledge about the input function except for it being a polynomial of degree $k$,
thus the family of $k$-juntas is $O(2^k)$-locally correctable. □



1.2 Our results 



A natural question is whether the exponential upper bound for low-degree polynomials, which is 
applicable also for juntas, is indeed tight in the case of juntas. We show that the answer is indeed 
positive, but only for a small fraction of the juntas. In other words, for some juntas the exponential 
upper bound is also best possible but this is far from being the typical case. 

Theorem 3. There exist some $k$-juntas which require $2^{\Omega(k)}$ (adaptive or non-adaptive) queries in
order to be locally corrected, even for $\varepsilon$ which is exponentially small in $n$.

In the typical case, however, i.e., for almost every junta, the lower bound above is far from
being tight and in fact one can correct a typical $k$-junta using a nearly linear number of queries (in
$k$). Formally, we prove the following.

Theorem 4. A $k$-junta in which every influencing variable has influence of at least $1/50$ is
$O(k \log k)$-locally correctable for $\varepsilon < 2^{-k-3}$. Therefore this is the case for almost every $k$-junta.

Corollary 5. The family of $k$-juntas in which every influencing variable has influence of at least
$1/50$ is $O(k \log k)$-locally correctable for $\varepsilon < 2^{-k-3}$.



2 Local correction of $k$-juntas



We start this section with the proof of the lower bound for some juntas. The juntas used in the 
proof are very sparse, having an exponentially small fraction of inputs for which the value of the 
function is 1. 

Proof of Theorem 3. Given $k < n \in \mathbb{N}$ where $n$ is even, define $f$ to be the AND function of the
first $k$ literals $x_1, \ldots, x_k$. In order to prove a lower bound for the number of queries, we use Yao's
principle. To this end, we define two distributions on functions which are all $o(1)$-close to being
isomorphic to $f$, one for which the algorithm should return zero and another for which the algorithm
should return one (denoted by $\mathcal{D}_0$ and $\mathcal{D}_1$ respectively). We further show that any algorithm that
performs only $2^{o(k)}$ queries would not be able to distinguish between the two distributions with
non-negligible probability.

We first describe the distribution $\mathcal{D}_0$ as follows. We randomly choose a permutation $\sigma \in S_n$ so
that $\sigma(i) \in [n/2]$ for every $i \in [k]$, meaning the $k$ relevant variables are all in the first half.$^1$ The
function $g$ given to the algorithm is defined by $g(y) = f_\sigma(y)$ whenever the Hamming weight of $y$
is at most $0.3n$ in each half (i.e. $\sum_{i=1}^{n/2} y_i \le 0.3n$ and $\sum_{i=n/2+1}^{n} y_i \le 0.3n$) and otherwise $g(y) = 0$.
Notice that indeed $g$ is $o(1)$-close to being isomorphic to $f$ as we modified only an $o(1)$-fraction of
the inputs. The input $x$ is set to be the balanced input of $n/2$ zeros followed by $n/2$ ones. Clearly
$f_\sigma(x) = 0$ for every instance in $\mathcal{D}_0$ as required.

The distribution $\mathcal{D}_1$ is similar to $\mathcal{D}_0$ with one modification. The permutation $\sigma$ is chosen so
that $\sigma(i) \notin [n/2]$ for every $i \in [k]$. The choice of $x$ and the locations where we fix $g(y) = 0$ are
defined as before and indeed $f_\sigma(x) = 1$ for every instance in $\mathcal{D}_1$.

We first show that an arbitrary query to $g$ in either distribution would output one with probability
at most $2^{-\Omega(k)}$. Let $y$ be some query the algorithm performs. Clearly, if the Hamming weight
of $y$ in either half is more than $0.3n$, the result would be zero according to the definition of $g$ in
both distributions. Otherwise, the probability that $g(y) = 1$ is given by

$$\binom{m}{k} \Big/ \binom{n/2}{k} = \frac{(m-k+1)(m-k+2)\cdots m}{(n/2-k+1)(n/2-k+2)\cdots(n/2)} \le \left(\frac{3n/10}{n/2}\right)^k = 2^{-\Omega(k)},$$

where $m$ is the Hamming weight of $y$ in the relevant half (either the first half for $\mathcal{D}_0$ or the second
half for $\mathcal{D}_1$), which is known to be at most $0.3n$. Therefore, any algorithm that performs at most
$2^{o(k)}$ queries would find a $y$ for which $g(y) = 1$ only with negligible probability, and it would not be
able to distinguish between $\mathcal{D}_0$ and $\mathcal{D}_1$ with noticeable probability. Notice that the proof implies
that using an adaptive algorithm would not yield any improvement as we can predict all results to
be zero in advance (and therefore this is equivalent to a non-adaptive algorithm). □
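
To get a feeling for how small the probability of seeing a one is, the following Python sketch evaluates the ratio of binomial coefficients from the proof and compares it with $(3/5)^k$. The function name `hit_probability` and the concrete values of $n$ and $k$ are illustrative assumptions, not part of the paper.

```python
from math import comb

def hit_probability(n, k, m):
    """Probability that k relevant variables, placed uniformly at random among
    the n/2 positions of one half, all land on the m ones of the query y."""
    return comb(m, k) / comb(n // 2, k)

n = 1000
m = 3 * n // 10  # largest Hamming weight in one half for which g is not forced to 0
for k in (5, 10, 20, 40):
    # The exact probability never exceeds (3/5)^k, which decays exponentially in k.
    print(k, hit_probability(n, k, m), 0.6 ** k)
```

Since each query returns one with probability at most $(3/5)^k$, a union bound over $2^{o(k)}$ queries still leaves a negligible chance of ever seeing a one.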

The fact that the AND junta is very sparse was crucial for the above proof. In order to prove a
better upper bound for most juntas, we need some restriction that would ensure the function is far
from being sparse. In Theorem 4 we required something even stronger, that the influence of every
influencing variable, that is, any of the $k$ special variables of the junta, is at least $1/50$.

Definition 1 (Influence). Given a Boolean function $f : \mathbb{Z}_2^n \to \mathbb{Z}_2$, the influence of $i$ with respect to
$f$ is defined by

$$\mathrm{Inf}_i(f) = \Pr_x[f(x) \ne f(x + e_i)],$$

where $e_i$ is the vector having 1 only at location $i$.

Thus, the influence is the probability that changing the value of the ith variable will also 
change the value of the function. This probability is taken over all values of x, and is therefore 
the expected influence of i in a restricted function (when the variable i itself is not restricted). 

$^1$ Throughout this work we use the notation $[\ell] := \{1, 2, \ldots, \ell\}$.






Moreover, if the influence of some variable i is greater than 1/50, then the function is 1/100-far 
from being a constant. 
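
For small $n$ the influence can be computed by brute force directly from the definition. The Python sketch below is a minimal illustration; the function names `influence`, `and_k` and `parity` are hypothetical names chosen here. It also shows why the AND junta used in the lower bound fails the condition of Theorem 4: each of its $k$ variables has influence $2^{-(k-1)}$, which drops below $1/50$ already for moderate $k$.

```python
import itertools

def influence(f, n, i):
    """Fraction of inputs x in {0,1}^n for which flipping bit i changes f(x)."""
    flips = 0
    for x in itertools.product((0, 1), repeat=n):
        y = list(x)
        y[i] ^= 1  # y = x + e_i over Z_2
        flips += f(x) != f(tuple(y))
    return flips / 2 ** n

def and_k(x):
    """AND of all input bits."""
    return int(all(x))

def parity(x):
    """XOR of all input bits."""
    return sum(x) % 2

print(influence(and_k, 3, 0))    # 0.25: bits 1 and 2 must both be one
print(influence(parity, 4, 2))   # 1.0: every flip changes the parity
print(influence(and_k, 10, 0))   # 2^-9, well below 1/50
```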

Given a random $k$-junta for which the influencing variables are the first $k$ variables, the influence
of some variable $1 \le i \le k$ is determined by the bias of the $2^{k-1}$ pairs of inputs of length $k$ which
differ only in the $i$th variable, where the values of all other variables in $[k]$ range over all possibilities.
The expected bias is hence $1/2$, and moreover

$$\Pr[\mathrm{Inf}_i(f) < 1/50] = \Pr[B(2^{k-1}, 1/2) < 2^{k-1}/50] < 2^{-c 2^k}$$

for some absolute constant $c > 0$, where here $B$ is the binomial distribution and we applied one
of the standard estimates for binomial distributions (cf., e.g., [3], Appendix A). Therefore, by the
union bound, the $k$ influencing variables would all have influence greater than $1/50$ with probability
$1 - 2^{-\Omega(2^k)}$.
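
The binomial tail above can be evaluated exactly for small $k$, which gives a concrete sense of how quickly it vanishes. The following Python sketch does so with exact rational arithmetic; the function name `low_influence_probability` is an illustrative assumption, and the code only evaluates the tail for specific $k$ rather than proving the general $2^{-c 2^k}$ estimate.

```python
from fractions import Fraction
from math import comb

def low_influence_probability(k):
    """Exact Pr[B(2^(k-1), 1/2) < 2^(k-1) / 50] for a uniformly random k-junta."""
    pairs = 2 ** (k - 1)
    # Count outcomes with j disagreeing pairs, where j < pairs / 50.
    count = sum(comb(pairs, j) for j in range(pairs + 1) if 50 * j < pairs)
    return Fraction(count, 2 ** pairs)

for k in (4, 6, 8, 10):
    p = low_influence_probability(k)
    # Union bound over the k influencing variables.
    print(k, float(p), float(k * p))
```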

Now that we have defined the influence of a variable and verified that indeed almost every junta
satisfies the condition in the theorem, we describe the proof of Theorem 4.

Proof of Theorem 4. Let $f$ be a $k$-junta as in Theorem 4 and let $g$ be the given input function which
is $\varepsilon$-close to $f_\sigma$ (we assume $\varepsilon < 2^{-k-3}$ in order to guarantee that $g$ is close to a unique isomorphism
$f_\sigma$). Following the basic approach in the known junta testing algorithms (see e.g. [10, 15]), we intend
to randomly divide the variables into parts and identify which sets have influencing variables. Here,
however, mistakenly identifying a set as having influencing variables (due to faulty evaluations of
the input function) or having more than one such variable in a part is not an essential issue (as
estimating the number of influencing variables is not our goal).

Fix $s = 3k$ and partition the set $[n]$ into $s$ parts uniformly at random, by assigning each
variable to one of the $s$ sets independently. For each set we perform $r = 100 \log k + 500$ pairs of queries,
where each pair $(x, x')$ is chosen independently and uniformly at random such that $x$ and $x'$ agree
on all elements outside of the current set. When a set has at least one influencing variable (with
influence at least $1/50$), each such pair would yield different outcomes with probability at least
$1/100$ (as the randomly restricted function over the variables outside of the current set is expected
to be at least $1/100$-far from being a constant). Therefore, the probability we would dismiss such
a set is at most $0.99^r < 0.5^{\log k} \cdot 0.99^{500} < 1/(100k)$ (assuming we did not hit a faulty evaluation,
a probability that we later consider). Since there are at most $k$ sets with influencing variables, by
the union bound we would identify them all with probability at least $99/100$.
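
The arithmetic behind the choice of $r$ can be checked numerically. The Python sketch below assumes that the logarithm is taken to base 2, which is what makes $0.5^{\log k} = 1/k$; the function name `dismiss_bound` is a hypothetical name used only for this check.

```python
import math

def dismiss_bound(k):
    """Upper bound 0.99^r on dismissing an influencing part,
    with r = 100 * log2(k) + 500 (base-2 logarithm is an assumption)."""
    r = 100 * math.log2(k) + 500
    return 0.99 ** r

for k in (1, 2, 10, 100, 1000):
    # Each bound should be below 1/(100k), so a union bound over at most k
    # influencing parts leaves a failure probability of at most 1/100.
    print(k, dismiss_bound(k), 1 / (100 * k))
```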

In order to estimate how many sets we would consider as influencing, we compute the probability
that a non-influencing set would mistakenly be considered otherwise. This can only occur if we
query the function at a faulty input, which happens with probability $\varepsilon$. During this process, we
perform only $sr = O(k \log k)$ pairs of independent non-adaptive queries and therefore we would hit
a faulty evaluation only with probability $O(k \log k / 2^k) = 2^{-\Omega(k)}$.

So far with good probability we have identified at most $k$ sets which are influencing. Let
$S_1, \ldots, S_k$ denote these $k$ sets (where we add some arbitrary randomly chosen sets if fewer than $k$
were found) and define $S = \bigcup_{i=1}^{k} S_i$ to be their union (notice that the size of $S$ is expected to be
$\mathbb{E}[|S|] = n/3$). Given the input $x$ for which we were asked to determine $f_\sigma(x)$, we would like to
choose an input $y$ which agrees with $x$ on all indices from $S$, and yet is uniformly distributed (except
for the restriction to match $x$ on the $k$ influencing variables). Achieving this would guarantee that
the probability of $y$ hitting a faulty evaluation is at most $2^{k+1}/2^{k+3} = 1/4$ (even if all faulty
evaluations fall into inputs which agree with $x$ on these $k$ variables).

Let $p = 3/4$ and define $y$ so that $y_i = x_i$ for every $i \in S$, and otherwise $\Pr[y_i \ne x_i] = p$.
Whenever $i$ is not one of the special $k$ variables, $\Pr[y_i \ne x_i] = \Pr[i \notin S] \cdot p = \frac{2}{3} \cdot \frac{3}{4} = \frac{1}{2}$ and these
probabilities are all independent. The independence between the different variables and the fact
that $\Pr[i \notin S]$ is exactly $2/3$ are crucial points which require a moment of reflection to verify. This
implies that indeed $y$ is uniformly distributed over all inputs which agree with $x$ on the special $k$
variables, as required.

Combining the two parts together, the algorithm would return the correct answer $g(y) = f_\sigma(x)$
with probability at least $3/4 - 1/100 - 2^{-\Omega(k)} > 2/3$ (for large enough $k$). □
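
The whole procedure from the proof can be summarized in a short Python sketch. This is an illustration under several assumptions rather than a definitive implementation: the function name `locally_correct` is hypothetical, the logarithm in $r$ is taken to base 2, the oracle `g` is any Python callable on bit tuples, and if noise causes more than $k$ parts to be flagged the sketch simply keeps the first $k$. It uses $2sr + 1 = O(k \log k)$ queries.

```python
import math
import random

def locally_correct(g, x, k, rng):
    """Return a guess for f_sigma(x) using O(k log k) queries to g."""
    n = len(x)
    s = 3 * k
    r = math.ceil(100 * math.log2(k)) + 500
    part = [rng.randrange(s) for _ in range(n)]  # random partition of [n] into s parts

    def looks_influencing(j):
        """Try r random pairs that agree outside part j; report any disagreement of g."""
        for _ in range(r):
            a = [rng.randint(0, 1) for _ in range(n)]
            b = [rng.randint(0, 1) if part[i] == j else a[i] for i in range(n)]
            if g(tuple(a)) != g(tuple(b)):
                return True
        return False

    found = [j for j in range(s) if looks_influencing(j)][:k]
    # Pad with arbitrary random parts so that exactly k parts are frozen.
    rest = [j for j in range(s) if j not in found]
    rng.shuffle(rest)
    frozen = set(found + rest[:k - len(found)])

    # y agrees with x on the frozen parts; every other bit is flipped with probability 3/4.
    y = tuple(x[i] if part[i] in frozen else x[i] ^ (rng.random() < 0.75)
              for i in range(n))
    return g(y)
```

Freezing exactly $k$ of the $3k$ parts is what makes each non-special coordinate fall outside the frozen set with probability $2/3$, so that flipping with probability $3/4$ yields an unbiased bit.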

Proof of Corollary 5. The algorithm provided here did not use any specific knowledge of the function
$f$ except for the guarantee of its structure, being a $k$-junta in which each influencing variable
has influence at least $1/50$. Therefore, this family is $O(k \log k)$-locally correctable for $\varepsilon < 2^{-k-3}$.
Hence, for every fixed $k$ this family forms a locally correctable code which has polynomial size in
$n$ and a constant number of queries $O(k \log k)$. □



3 Conclusions and open problems 

In this work we have shown that $k$-juntas and degree $k$ polynomials are $q$-locally correctable, where
$q$ depends on the structure parameter $k$ of the function, and not on the number of variables $n$.
We have also seen that $q$ is always at most $O(2^k)$ and that sometimes an exponential behavior is
tight. The main general open question in this subject is that of computing the query complexity
of local correction for any given function. In particular, it would be very interesting to find a
characterization of all functions that are "easily" correctable, that is, have constant query complexity
(independent of $n$).

Although we have seen both upper and lower bounds for juntas and polynomials, the lower
bound is only applicable for specific functions. Symmetric functions, for example, are always
0-locally correctable as one does not need to query the function at all in order to correct an input.
However, such functions can have arbitrarily large degree as polynomials. Taking the majority function
as an example, it is trivial to correct and yet it has high degree, but even a slight modification
of it, $\mathrm{Maj}_{n-1}$, the majority of $n-1$ of the variables, makes it hard to correct. Given $\mathrm{Maj}_{n-1}$,
one can modify an $o(1)$ fraction of the inputs, namely the balanced layer, and make this function
impossible to correct with any number of queries.